Fuzzy K-Nearest Neighbor Based Dental Fluorosis Classification Using Multi-Prototype Unsupervised Possibilistic Fuzzy Clustering via Cuckoo Search Algorithm

Dental fluorosis in children is a prevalent disease in many regions of the world. One of its root causes is excessive exposure to high concentrations of fluoride in contaminated drinking water during tooth formation. Typically, the disease causes undesirable chalky white or even dark brown stains on the tooth enamel. To help dentists screen the severity of fluorosis, this paper proposes an automatic image-based dental fluorosis segmentation and classification system. Six features from red, green, and blue (RGB) and hue, saturation, and intensity (HIS) color spaces are clustered using unsupervised possibilistic fuzzy clustering (UPFC) into five categories: white, yellow, opaque, brown, and background. The fuzzy k-nearest neighbor method is used for feature classification, and the number of clusters is optimized using the cuckoo search algorithm. The resulting multi-prototypes are further utilized to create a binary mask of teeth and used to segment the tooth region into three groups: white–yellow, opaque, and brown pixels. Finally, a fluorosis classification rule is created based on the proportions of opaque and brown pixels to classify fluorosis into four classes: Normal, Stage 1, Stage 2, and Stage 3. The experimental results on 128 blind test images showed that the average pixel accuracy of the segmented binary tooth mask was 92.24% over the four fluorosis classes, and the average pixel accuracy of segmented teeth into white–yellow, opaque, and brown pixels was 79.46%. The proposed method correctly classified four classes of fluorosis in 86 images from a total of 128 blind test images. When compared with a previous work, this result also indicates 10 out of 15 correct classifications on the blind test images, which is equivalent to a 13.33% improvement over the previous work.


Introduction
Dental fluorosis-a condition where the appearance of the tooth enamel changes-results from an excessive intake of fluoride during the tooth development period. Fluorosis occurs in many regions of the world [1][2][3], with a global prevalence of over 20 million cases as estimated by [4]. Although fluoride is known for its benefits in preventing dental caries, its use requires a balance between caries protection and the risk of dental fluorosis. It is evident from the existing literature that excessive fluoride exposure can take place due to Table 1. Severity of dental fluorosis according to Dean's index and Yeesarapat et al. [14].

Dean's Index
Description of Tooth Enamel Yeesarapat et al.

Normal
Smooth, glossy, pale creamy-white translucent surface. Normal Questionable A few white flecks or white spots that cannot be determined as very mild or normal. Stage 1 Very mild Small, opaque, paper-white areas covering less than 25% of the tooth surface. This also includes teeth showing no more than about 1-2 mm of white opacity at the tips of the summits of the cusps of the bicuspids or second molars.

Stage 1
Mild Opaque white areas covering less than 50% of the tooth surface. Stage 1 Moderate All tooth surfaces affected; marked wear on biting surfaces; brown stains may be present. Stage 2 Severe All tooth surfaces affected; discrete or confluent pitting; brown stains present. Stage 3 In the dental fluorosis screening process, clinical grading of tooth enamel is subject to subjective biases by examiners due to various factors-for example, knowing the vicinity where a subject resides possibly hints at the subject's fluoridation status. Such a bias can be addressed by using a standardized image-based method, where examiners remotely evaluate clinical photographs without prior knowledge of the subject's location. However, the examination of clinical photographs is still prone to biases, because individual examiners inherently choose different grading thresholds. As a result, there has been interest in developing an image-based automated system for examining the severity of fluorosis. One of the pioneering works on dental fluorosis image analysis is by Pretty et al. [15], where the authors utilized image processing techniques to quantify dental fluorosis levels in fluorescence imaging and experimentally showed that the quantity has a good correlation with the TF index. McGrady et al. [16] also further showed that populations with different levels of fluoride exposure could be discriminated by fluorescence imaging. A dual-camera system that can simultaneously capture both a fluorescence image and a polarized white-light image was used by Liu et al. [17] as a part of an automatic fluorosis classification system. In their work, both image modalities were used to extract five-dimensional fluorosis feature vectors, and then TF index predictions were obtained by RUSBoost [18], with a decision tree as a base learner.
The existing literature on automatic fluorosis classification typically relies on fluorescence images acquired from a quantitative light-induced fluorescence (QLF) imaging device, as it can measure the percentage of fluorescence change in demineralized enamel, which becomes lower due to fluorosis. Other approaches based on techniques such as Raman spectroscopy [19] can also be found in the literature. However, those pieces of specialty equipment come with a disadvantage, due to their non-ubiquity. Alternatively, handheld digital cameras-including smartphone cameras-are more accessible to the public. According to [20], photographic assessments of dental fluorosis, where the photos were taken using a digital SLR camera, exhibited good agreement with clinical assessments using Dean's index. As a result, it is intuitively tempting to develop an automated dental fluorosis classification tool to assess the severity of dental fluorosis in photographic images of teeth. Such a device could help reduce examiners' workload in the screening process, which is normally time-consuming, laborious, and prone to human error.
Although photographic image analysis has been introduced to solve some dentistry problems such as dental plaque detection [20][21][22][23][24][25][26] in the past, a few studies have been conducted on the automatic dental fluorosis classification of photographic images. One example includes the work by Yeesarapat et al. [14], where the authors proposed an imagebased dental fluorosis classification system using multi-prototype fuzzy c-means (FCM) to classify dental fluorosis into four classes adapted from Dean's index: Normal, Fluorosis Stage 1, Fluorosis Stage 2, and Fluorosis Stage 3. Their Normal, Fluorosis Stage 2, and Fluorosis Stage 3 classes correspond to Normal, Moderate, and Severe in Dean's index, respectively, while Fluorosis Stage 1 was created to cover three classes: Questionable, Very Mild, and Mild, as summarized in Table 1. FCM was used to cluster six-dimensional pixel values in the red, green, and blue (RGB) and hue, saturation, and intensity (HIS) color spaces for each of three groups: normal white (either white or yellow), opaque white, and brown pixels. A total of 1600 prototypes were obtained, and the nearest prototype classifier was used to assign each pixel value to one of these three groups. The classification criteria based on the amount of pixel values found in each group were then applied to classify an image of tooth enamel into four dental fluorosis conditions. Their proposed method yielded a correct classification rate of 42.85% for the training set and 53.33% for the blind test set. However, the process required dentists to manually select the tooth regions in the image. Their model selection was also performed manually, and the number of clusters was suboptimal.
Inspired by the work by Yeesarapat et al. [14], this paper proposes an image-based automatic system for dental fluorosis classification based on image segmentation through multi-prototype unsupervised possibilistic fuzzy clustering (UPFC). Feature vectors composed of pixel values in the RGB and HSI color spaces were clustered into five classes: white, yellow, opaque, brown, and background, where the number of clusters was optimized using the cuckoo Search (CS) algorithm. A set of pixels from seven images were used for training, and the best prototypes were chosen by 10-fold cross-validation. After that, tooth segmentation was implemented based on these clusters through the fuzzy k-nearest neighbor (FKNN) method. The proportions of opaque and brown pixels in the tooth region were used to determine four stages of fluorosis: Normal, Stage 1, Stage 2, and Stage 3, with the same severity levels defined by Yeesarapat et al. [14]. We evaluated our proposed method on 128 blind test images taken from 128 subjects in both segmentation and classification tasks. In overall, the proposed system provided a correct dental fluorosis classification for 86 out of 128 images, which is 13.33% better than that obtained in the prior work [14].
Although this image-based fluorosis detection system does not include other tooth deficiency factors-for example, mineralized tissue loss causing cavitation-the dentist can use this system in a pre-screening process to grade the condition before treating the patient. According to the dentist, the treatment rules based on the tooth condition according to the Table 1 are as follows: 1.
If there are some opaque white areas (Stage 1 fluorosis), the treatment can be whitening or microabrasion.

2.
If there are a lot of opaque white areas and some brown areas (Stage 2 fluorosis), the treatment can be tooth restoration, e.g., tooth grinding with composite filling, composite veneer, or ceramic veneer.

3.
If there are a lot of brown areas (Stage 3 fluorosis), the treatment can be similar to the Stage 2 treatment. However, the ceramic used in the treatment might need to be more opaque to cover all of the underlying color. In addition, a restorative dentistry specialist is needed in this case, rather than a general practice dentist. However, this system can be used in rural areas where there are not enough dentists. Moreover, the pictures can be taken by non-professional personnel with no mouth retractor or the need for a process to dry the teeth. If any mild/severe cases are detected, those patients can be sent to the dentist to recheck for a definitive diagnosis of their teeth before the treatment.
The rest of this paper is organized as follows: Section 2 describes the related backgrounds-HSI color space, UPFC, CS, and FKNN. The dataset and the pipeline of our proposed method (i.e., clustering, segmentation, and classification) are described in Section 3, followed by the experimental results and discussions in Section 4. Finally, our concluding remarks are provided in Section 5.

Relevant Background
This section describes the four main ingredients of the proposed dental fluorosis classification system, including the transformation of the RGB color space to the HSI color space, the unsupervised possibilistic fuzzy clustering algorithm, the cuckoo search algorithm, and the fuzzy k-nearest neighbor algorithm.

HSI Color Space
The HSI color space [27] is a very important color space that separates color information from intensity information. The hue (H) component represents color information, and the human vision system can distinguish different hues as different colors. The saturation (S) component describes color information in terms of how much the hue is diluted with white light-or, in other words, the purity of the color. The intensity (I) component represents the amount of light, or the brightness, in an image. One interesting feature of the HSI color system is that it represents colors similar to how they are perceived by humans. The nonlinear transformation from the RGB color space to the HSI color space (H, S, I) is mathematically defined as follows [27]: where R, G, B ∈ [0, 1] represent the red, green, and blue components of the RGB color space, respectively, while H ∈ [0, 2π] and S, I ∈ [0, 1] represent the components of the HSI color space. In this paper, the color components of one pixel in both the RGB and HSI color spaces are considered as feature vectors.

Unsupervised Possibilistic Fuzzy Clustering (UPFC)
UPFC [28,29] is a hybrid fuzzy clustering method that integrates the benefits of fuzzy c-means clustering (FCM) [30] and the possibilistic clustering algorithm (PCA) [31]. FCM is a well-known fuzzy algorithm proposed by Bezdek that works by assigning a fuzzy membership degree to each sample in the dataset, according to a sample's distances to the cluster centers. A sample will have a high degree of membership to a cluster if it is close to that cluster's center, or vice versa. One major drawback of FCM is that it is sensitive to outliers or noises in the dataset. Several approaches have been proposed to address the noise sensitivity issue. For example, possibilistic c-means (PCM) [32]-a possibilistic approach to clustering proposed by Krishnapuram and Keller-tried relaxing the probabilistic constraints of FCM to lessen the problem.
Although PCM can improve robustness, some parameter computations still require the implementation of FCM. Yang and Wu [31] also argued that the parameters of PCM were hard to handle and, consequently, proposed another possibilistic clustering approach called the possibilistic clustering algorithm (PCA). PCA is considered to be an improvement over FCM and PCM. While FCM produces memberships, PCA produces possibilities. Its resulting membership function in the form of an exponential function makes it robust to noise; however, it is unfortunately sensitive to the initialization and possible generation of coincident clusters. As a result, UPFC [28] was proposed as an extension of the PCA. It simultaneously generates both memberships and possibilities, and it also overcomes the noise sensitivity problem in FCM and the coincident cluster problem in PCA.
The UPFC algorithm can be briefly described as follows: Let X = x 1 , x 2 , . . . , x N S be a dataset of N S samples, where each sample is a p-dimensional vector (R p ). Suppose that the number of clusters is N C , and let C = c 1 , c 2 , . . . , c N C be the set of cluster centers. The goal of UPFC is to minimize the objective function with the constraints where µ ij,FCM and µ ij,PCA are the fuzzy membership value and the possibilistic value of vector x i in cluster j, respectively. The weighting exponents m > 1 and n > 1 are called the fuzzifier and the typicality, respectively. The multipliers a > 0 and b > 0 define the relative importance of the fuzzy membership and possibilistic values, respectively. The first term in (2) minimizes the distances from the feature vectors to the cluster centers, while the second term is added for the purposes of partition entropy (PE) and partition coefficient (PC) validity indices [31]. When b = 0, UPFC simply becomes PCA. The objective function (2) can be iteratively solved by Algorithm 1, beginning with the prototype initializations. For each iteration, the fuzzy membership values, the possibilistic degrees, and the cluster centroids are updated using (3), (4) and (5), respectively, until the computed cluster centers converge.
Furthermore, as suggested in [31], the parameter β > 0 in (2) results in the update Equation (4), where it measures the degree of separation of the dataset. Therefore, it is reasonable to define β as a sample variance computed from the distances between data samples and the vector of sample means (x avg ) of the dataset, as shown in (6).

Cuckoo Search (CS)
Cuckoo search (CS) [33,34], a nature-inspired metaheuristic algorithm proposed by Yang and Deb in 2009, is based on the obligate brood parasitic behavior of some cuckoo species in combination with the Lévy flights behavior of some birds and fruit flies. Naturally, some cuckoo species deploy an aggressive reproduction strategy by laying their eggs in the nests of other host birds. A host bird may be of another species. If it discovers that the eggs in its nest are not its own, it might throw the alien eggs away, or simply abandon its nest and build a new one. The CS algorithm, along with its variants, provides promising efficiency and has been shown to solve many real-world optimization problems [35,36].
According to [33], three idealized rules of CS can be described as follows: 1.
Each cuckoo lays one egg at a time and dumps its egg in a randomly chosen nest.

2.
The best nests with high quality of eggs will carry over to the next generations. 3.
The number of available host nests is fixed, and the egg laid by a cuckoo is discovered by the host bird with a probability p a ∈ [0, 1]. In this case, the host bird can either throw the egg away or abandon the nest and build a completely new nest.
Based on these rules, the generalized pseudocode of CS is summarized in Algorithm 2. Given an objective function to be minimized, each egg represents a solution stored in a nest, and a cuckoo egg represents a new solution. An egg is considered to be of higher quality (i.e., its solution is closer to an optimal value) if it has more resemblance to the host bird's eggs; thus, it has a higher chance to survive or become the next generation. A Lévy flight is used to model the search for a suitable nest by a cuckoo. The newly laid cuckoo egg (or the new solution) will replace the existing egg in the nest if it has higher quality. Furthermore, a fraction (p a ) of the worst nests are discarded, and new ones are randomly built. Algorithm 2. Cuckoo search via Lévy flights [33].
Get a cuckoo randomly by Lévy flights, evaluate its fitness F i 4: Randomly choose a nest (say j) among N nests 5: if F j < F i then 6: Replace j with the new solution 7: end if 8: A fraction p a of the worst nests are abandoned, and new ones are built 9: Keep the best solutions 10: Rank the solutions and find the current best 11: end while The CS algorithm uses Lévy flights-a type of random walk-to randomly update each cuckoo, with the following update equation: where x t+1 i is the i-th cuckoo at generation t + 1, and s is the step size of the Lévy flights. It can be computed based on Mantegna's algorithm [37] as follows: where α is a constant and β is the Lévy exponent, where 0 < β ≤ 2. The operator ⊕ is the element-wise product. x t i and x t best are the i-th cuckoo and the best cuckoo at generation t, respectively. Lévy(β) is the Lévy distribution, where its parameters u and v are the zero-centered normally distributed random variables, u ∼ N (0, σ 2 u ) and v ∼ N (0, σ 2 v ). Their standard deviations are defined as follows: where Γ is the standard gamma function.

Fuzzy K-Nearest Neighbor (FKNN)
To assign a feature vector to one prototype, the fuzzy k-nearest neighbor (FKNN) algorithm [38] is intensively applied in this paper. Let a set of vectors . . , N i } represents prototype j in class I, c is the number of classes, and N i is the number of prototypes in class i. Given an example x ∈ R p , after the k-nearest prototypes of input vector x are found, the membership value of x in class i-denoted as u i (x)-can be computed as follows [38]: where x 1 , x 2 , . . . , x K are the k-nearest prototypes, while m is the scaling parameter, whose value is set to 1.5 in this paper. The membership value of prototype x j in class i is denoted as u ij . Since each prototype belongs to a known class, the membership value u ij is simply defined as an indicator function: After the membership values of x are obtained for all classes, the classification rule is as follows:

Materials and Methods
The proposed automatic system for dental fluorosis classification is based on semantic segmentation of tooth enamel, where each pixel is labeled into five color classes: white, yellow, opaque, brown, and background. The white and light-yellow colors, in general, belong to healthy tooth enamel, while opaque-white and brown colors are considered to be indicators of dental fluorosis in Dean's index. The background class is assigned to all pixels not in the tooth region. A feature vector was formed by considering a pixel value in both the RGB and HSI color spaces, where the HSI features were computed by (1). The multi-prototype UPFC was used to generate clusters of feature vectors, and the optimal number of clusters was determined by the CS algorithm. One class of colors might consist of multiple clusters. The resulting clusters were used for tooth segmentation to extract a binary mask of the tooth region. Fluorosis severity was graded by evaluating the proportions of opaque and brown pixels in the tooth region. In this paper, four levels of fluorosis are considered, as defined by Yeesarapat et al. [14] in Table 1. The ground truth in the experiments was generated by a D.D.S. dentist with more than 10 years of experience.

Multi-Prototype Generation
Given a set of pixel values from the training images, the multi-prototype UPFC was used to generate clusters of RGB and HSI feature vectors. The overall training process was as presented in Algorithm 3, where the CS algorithm was used to determine the optimal number of clusters (N C ). The training process started by randomly initializing a set of N nests, N = {n 1 , n 2 , . . . , n N }, where n i is the i-th nest. Each nest n i contains a randomly chosen integer representing the number of clusters between L b and U b . The fitness value of a nest n i , denoted as F i , was defined by the squared error whereŷ is the expected clustering accuracy and y i is the predicted clustering accuracy for choosing the cluster size n i in the UPFC. Note that each cluster was assigned one label (out of five color labels) according to a majority vote, and the cluster accuracy y i could be computed by comparing the trained clusters with the pixels in the validation set. The experiment described in Section 4.1 used 10-fold cross validation-one fold for training and the remaining folds for validation. Therefore, for N nests, we had a set of fitness values, F = {F 1 , F 2 , . . . , F N }. Among all nests, one nest n j was randomly selected to obtain cuckoos with Lévy flights, as well as their fitness F j . A fraction p a of the worst nests were discovered and abandoned according to a binary random vector whose element was P i ∈ {0, 1}, where for i = 1, 2, . . . , N. Here, rand is a random number in the range [0, 1], and p a is known as the probability of discovery. A nest was rebuilt when P i = 1; otherwise, it was kept for the next generation. This process was repeated until the minimum fitness of the best nest was less than a specified threshold value ε or the number of iterations exceeded T max . Finally, the optimal clusters or multiple prototypes were obtained. Randomly choose the number of clusters n i for each nest as shown in (13)  3: Get the best nest n i and its fitness F i for the current best nest 4: Get cuckoos by Lévy flights using (7) to (10) 6: Calculate the fitness F j using (14) by performing UPFC on each nest. Keep the centroids 7: A fraction p a of the worst nests are abandoned, and new ones are built 8: Calculate the fitness F j using (14) by performing UPFC on each nest. Keep the centroids 9: if F j < F i then 10: Replace F i with F j and keep n j as the new best nest 11: end if 12: end while 13: return the centroid of each prototype

Dental Fluorosis Classification
The multiple prototypes obtained as described in Section 3.1 were used for tooth segmentation to separate teeth from gums. The severity of fluorosis was later classified based on the proportions of opaque and brown pixels in the tooth region.
To perform tooth segmentation, the pixel values of a given image were first computed against all prototypes and then assigned to one of five classes (white, yellow, opaque, brown, and background) using FKNN with K = 1 according to (11). Since the tooth region was composed of white, yellow, opaque, and brown pixels, a binary mask of the tooth region could be created. Morphological operators, including opening and dilation, were further used to remove undesired artifacts and enhance the tooth areas. After that, in the resulting binary image, if there were pixels in the tooth area that were misclassified as the background, these pixels would be reclassified again using FKNN with K = 5. The segmented binary mask (tooth pixel vs. background) was evaluated for its accuracy as described in Section 4.2.
An extracted tooth region could be used to determine the severity of fluorosis based on the classes of pixel values that fell under the region. Among the four classes, white and yellow pixels are natural colors of normal teeth, so they were considered together as the class of white-yellow pixels. On the other hand, opaque and brown pixels were apparent indicators to quantify fluorosis. As a result, further image segmentation was performed only in the tooth region by considering three classes of pixels: white-yellow, opaque, and brown.
Since fluorosis could be graded by the numbers of opaque and brown pixels appearing in the tooth region, and there might still exist a few tiny areas of opaque or brown pixels after the previous image processing steps, we further processed those tiny objects by removing them if their areas were less than 0.55% of the tooth region for the opaque areas, or less than 0.05% of the tooth region for the brown areas. The resulting tooth segmentation was also evaluated as described in Section 4.2 for pixel accuracy in predicting white-yellow, opaque, and brown pixels.
Upon completion of the image segmentation task, the numbers of white-yellow, opaque, and brown pixels in the tooth region could be determined. As a result, we designed a fluorosis classification rule (Algorithm 4) based on these quantities, where r opaque and r brown are the percentages of opaque and brown pixels in the tooth area, respectively. The idea behind this rule closely followed the enamel description in Table 1, i.e., the size of opaque and brown areas increased as fluorosis became more severe. Furthermore, Stage 2 and Stage 3 fluorosis not only involved brown pixels alone, but were also associated with a reasonably large area of opaque pixels. The choices of parameters used in this rule are discussed in Section 3.3. Its experimental results are shown in Section 4.3. Although this rule was motivated by Yeesarapat et al. [14], there was one key aspect of difference, as discussed in Section 4.4.

Dataset and Parameter Settings
To evaluate the performance of the proposed fluorosis classification algorithm as well as its tooth segmentation steps, we used the dataset of Yeesarapat et al. [14], which was collected by the Intercountry Centre for Oral Health (ICOH), Ministry of Public Health, from children in the rural areas of Chiang Mai province, Thailand. The images were taken with a RICOH Caplio RX camera without any advance preparation. In the study of Yeesarapat et al., experiments were conducted with a total of 22 images, where 7 images and 15 images were assigned for the training and test sets, respectively. In this paper, we used the same 7-image training set for training the UPFC. Furthermore, an additional 113 images were added to the original 15 test set images, resulting in a blind test set of 128 images. Each collected image was taken from each child. Although the number of teeth in each image was different, according to the dentist, this does not affect the performance of the fluorosis stage detection. This is because the detection rule is set based on the treatment condition (mentioned in the Introduction section), where the ratio of the opaque or brown area to the whole teeth area is considered. In addition, this system can be used as a pre-screening system before the patient is sent to see the dentist.
All images were cropped to the mouth region, to clearly show the teeth of each subject, where the final image sizes spanned from 815 × 517 to 1784 × 825 pixels. In addition, these collected images had varied resolutions to show that the proposed system is size-and resolution-invariant. One expert-a dentist with more than 10 years of experience-was asked to grade the severity of fluorosis in all of the images: Normal (no fluorosis), Stage 1 (questionable/mild), Stage 2 (moderate), and Stage 3 (severe). The same expert also provided five classes of pixel-level labels: white, yellow, opaque, brown, and background, where the background encompasses lips, tongues, and gums. The white class also includes reflected light spots in the images.
The parameter settings for the UPFC and cuckoo search in the experiment are shown in Table 2. However, for the fluorosis classifier, the parameters were manually set as follows: θ 1 = 0.05, θ 2 = 0.1, θ 3 = 0.3, δ = 0.007. The choice of these parameters was based on practical observations. If there was a very small number of opaque pixels (≤ 5%), or a slightly larger number of opaque pixels (≤ 10%) together with a very small number of brown pixels (δ ≤ 0.7%), the image would be graded as Normal. As the number of opaque pixels became higher, fluorosis became more severe, as seen by the values θ 1 < θ 2 < θ 3 . A large number of opaque pixels (θ 3 > 30%) but a small number of brown pixels (δ ≤ 0.7%) was classified as Stage 2, while the most severe level was indicated by a large number of brown pixels (δ > 0.7%). These parameter settings were based on the treatment condition according to the experienced dentist mentioned in the Introduction section. The performance [39,40]

Experiments on Multi-Prototype Generation
A subset of pixels from the training images, in both the RGB and HSI color spaces, were split into 10 folds. In total, there were approximately 4000, 4000, 2000, 1000, and 1800 selected pixels labeled as white, yellow, opaque, brown, and background, respectively. One fold was used to create multiple prototypes by the UPFC via cuckoo search (Algorithm 3), as described in Section 3.1, while the rest were used for the cluster validation. The parameters involved in these algorithms are described in Table 2. Since each pixel belonged to one of five color classes, each prototype was assigned to a particular class based on a majority vote in that particular cluster. The accuracy of the clustering was evaluated on the remaining nine folds, where we used FKNN with K = 1 to predict the label of each pixel. From the experiment, we found that the 10-fold cross-validation accuracy of pixel-wise classification was 91.81%. The performance of the best model from one fold yielded 92.96% pixel-wise accuracy. The number of prototypes generated by this model was 2500, where 662, 610, 414, 257, and 557 prototypes belonged to the white, yellow, opaque, brown, and background classes, respectively. The performances of this model are also presented in Tables 3 and 4, where we can observe that the value of FNR is particularly high for the opaque class. This is because its colors look relatively similar to the white and yellow classes, posing a challenge for fluorosis classification, since the opaque white color on tooth enamel can be a sign of fluorosis.

Experiments on Tooth Segmentation
The best multi-prototype model obtained from Section 4.1 was used to extract a binary tooth mask for each image, as described in Section 3.2. For each image in the training set, there were 2, 2, 2, and 1 images diagnosed as Normal, Stage 1, Stage 2, and Stage 3, respectively. The segmentation performances of the generated tooth masks for these images are presented in Table 5, where Acc mask is the pixel-wise accuracy of the binary mask, and Acc 3 is the accuracy of classifying each pixel in the tooth area into three classes: white-yellow, opaque, and brown pixels. Some of the training images and their segmentation results are shown in Figure 1. Table 5. Segmentation performance on each image in the training set, the percentages of opaque and brown pixels, and the comparison between predicted fluorosis stage vs. expert's opinion.  . The left column presents the expert's labels of opaque pixels and brown pixels, encircled in black and red, respectively. The center column presents the predicted binary tooth masks. The right column presents the predicted white-yellow, opaque, and brown pixels in blue, green, and red colors, respectively.

N1
For example, image F1_2 in Figure 1b was segmented to 280,326 white-yellow pixels and 22,737 opaque pixels with no brown pixels, resulting in the = 3 Acc 86.58% correct classification rate in Table 5.
According to Table 5, the mask Acc score for each image is relatively high. It is worth noting that the segmentation accuracy of the training images in Stage 2 (F2_1 and F2_2) is clearly lower than that of the other classes. This is because a portion of the tooth was shaded, causing its appearance to resemble the background class, as depicted in Figure  1c, where the leftmost incisors are segmented as background. The low 3 Acc score of image F2_2 (53.26%) also exhibited a case of segmentation errors, where a shadow in the image caused a pixel to be classified as a brown pixel.
To further evaluate the segmentation performance of the multi-prototype model, we tested the model with 128 blind test images. The results are shown in Table 5  . The left column presents the expert's labels of opaque pixels and brown pixels, encircled in black and red, respectively. The center column presents the predicted binary tooth masks. The right column presents the predicted white-yellow, opaque, and brown pixels in blue, green, and red colors, respectively.
For example, image F1_2 in Figure 1b was segmented to 280,326 white-yellow pixels and 22,737 opaque pixels with no brown pixels, resulting in the Acc 3 = 86.58% correct classification rate in Table 5.
According to Table 5, the Acc mask score for each image is relatively high. It is worth noting that the segmentation accuracy of the training images in Stage 2 (F2_1 and F2_2) is clearly lower than that of the other classes. This is because a portion of the tooth was shaded, causing its appearance to resemble the background class, as depicted in Figure 1c, where the leftmost incisors are segmented as background. The low Acc 3 score of image F2_2 (53.26%) also exhibited a case of segmentation errors, where a shadow in the image caused a pixel to be classified as a brown pixel.
To further evaluate the segmentation performance of the multi-prototype model, we tested the model with 128 blind test images. The results are shown in Table 5, where Acc mask and Acc 3 are reported per fluorosis class. Among the 128 images, there were 44, 41, 23, and 20 images diagnosed as Normal, Stage 1, Stage 2, and Stage 3, respectively, as shown in Table 6. The segmentation results on these images demonstrated a good generalization performance of the multi-prototype UPFC model in conjunction with the cuckoo search algorithm, as the average segmentation accuracy was 92.24% and 79.46% for Acc mask and Acc 3 , respectively. The results were also consistent with the results from the training set, reflecting that Stage 2 fluorosis was a challenging case. The reasons for some misclassifications might have been due to reflected light or malocclusion in the image caused by the image acquisition process, or the presence of some artifacts (such as saliva and dental plaque) in the tooth area.

Experiments on Dental Fluorosis Classification
The severity of fluorosis in each image was classified using Algorithm 4, where the percentages of opaque and brown pixels in the tooth area (r opaque and r brown , respectively) were obtained from the image segmentation. For example, the segmentation result of image F3_1 (Figure 1d) had 200,431 white-yellow pixels, 117,650 opaque pixels, and 5397 brown pixels; thus, we had r brown = 1.67%, as shown in Table 5.
The classification results from the training set and the blind test set are reported in Tables 5 and 7, respectively. The predicted accuracy on the training set was five images from a total of seven images (71.43%) for the training set. For the blind test set, the performance on each individual class in terms of accuracy was between 81% and 86%, as shown in Table 7. The classification accuracy for all four fluorosis classes was 86 images out of 128 images (67.19%) according to the confusion matrix in Table 8. Some examples of correct prediction of fluorosis classes are provided in Figure 2, where it can be observed that the percentages of opaque pixels (green color) increase from Figure 2a to Figure 2c, with no brown pixels present. The only figure that has brown pixels (red color) is Figure 2d. performance on each individual class in terms of accuracy was between 81% and 86%, as shown in Table 7. The classification accuracy for all four fluorosis classes was 86 images out of 128 images (67.19%) according to the confusion matrix in Table 8. Some examples of correct prediction of fluorosis classes are provided in Figure 2, where it can be observed that the percentages of opaque pixels (green color) increase from Figure 2a to Figure 2c, with no brown pixels present. The only figure that has brown pixels (red color) is Figure  2d.   The left column presents the expert's labels of opaque pixels and brown pixels, encircled in black and red, respectively. The center column presents the predicted binary tooth masks. The right column presents the predicted white-yellow, opaque, and brown pixels in blue, green, and red colors, respectively.
In addition, we also investigated misclassifications on the blind test set, some of which are summarized in Figure 3. According to Figure 3a, the misclassification was due to the malocclusion of the right lower incisor, where the shaded area was assigned to brown pixels. In Figure 3b, the expert annotated the pixels in the lower left molar as the white-yellow class; however, our system largely assigned these pixels to the opaque class. This segmentation error caused our system to incorrectly report higher severity of fluorosis as Stage 2, instead of Stage 1.   In addition, we also investigated misclassifications on the blind test set, some of which are summarized in Figure 3. According to Figure 3a, the misclassification was due to the malocclusion of the right lower incisor, where the shaded area was assigned to brown pixels. In Figure 3b, the expert annotated the pixels in the lower left molar as the white-yellow class; however, our system largely assigned these pixels to the opaque class. This segmentation error caused our system to incorrectly report higher severity of fluorosis as Stage 2, instead of Stage 1.

Comparison with Prior Works
The fluorosis classification rule used by Yeesarapat et al. [14] was rather similar to Algorithm 4. However, one major difference is in Line 3 of our algorithm, which did not appear in Yeesarapat et al.'s. In fact, our paper used the parameters = =  Figure 4, where the input image has no sign of fluorosis. According to the results of

Comparison with Prior Works
The fluorosis classification rule used by Yeesarapat et al. [14] was rather similar to Algorithm 4. However, one major difference is in Line 3 of our algorithm, which did not appear in Yeesarapat et al.'s. In fact, our paper used the parameters θ 1 = 0.05, θ 2 = 0.1, θ 3 = 0.3, δ = 0.007, while θ 2 = 0.05, θ 3 = 0.4, δ = 0.0065 were used by Yeesarapat et al. In order to investigate the importance of Line 3 in Algorithm 4, we illustrate a sample case in Figure 4, where the input image has no sign of fluorosis. According to the results of image segmentation, opaque and brown areas in this image contributed to 1.63% and 1.35% of the entire tooth region, as shown in Figure 4. Consequently, it was misclassified as Stage 3 by Yeesarapat et al. The cause of this misclassification was because pixels in the shaded area behind the lower lip were incorrectly labeled as brown pixels. In fact, any brown area is considered to be sign of fluorosis if there are also a fair number of opaque pixels in the tooth region. Consequently, incorporating Line 3 into our classification rule helped reflect this condition, allowing our algorithm to correctly classify Figure 4 as Normal.  One contribution of our paper over the work of Yeesarapat et al. is in the creation of binary tooth masks through the UPFC, CS, and FKNN algorithms, instead of using manual selection. In addition, to demonstrate another improvement of our method over that of Yeesarapat et al., we compared the classification performance on the 7 training images and 15 blind test images used by Yeesarapat et al., as reported in Table 9. It should be noted that their training images were the same set used in our experiment. Their blind test images were also a subset of the 128 blind test images discussed in Sections 4.2 and 4.3. The results of fluorosis classification are shown in Table 9, where our model achieved agreement with the expert's opinion for 5 out of 7 images (71.43%) and 10 out of 15 images (66.67%) for the training and blind test sets, respectively. However, the model of Yeesarapat et al. only achieved agreement for 3 out of 7 images (42.86%) and 8 out of 15 images (53.33%) for the training and blind test sets, respectively. These results could be interpreted as 28.57% and 13.33% improvements for the training and blind test sets, respectively. Consequently, these results demonstrate that our approach to fluorosis classification is superior to that of Yeesarapat et al.

Conclusions
In this paper, we proposed an automatic system of dental fluorosis classification using six color features from the RGB and HSI color spaces as feature vectors. Unsupervised possibilistic fuzzy clustering (UPFC) was used to cluster these features, since each feature vector had a label in one of five classes-white, yellow, opaque, brown, One contribution of our paper over the work of Yeesarapat et al. is in the creation of binary tooth masks through the UPFC, CS, and FKNN algorithms, instead of using manual selection. In addition, to demonstrate another improvement of our method over that of Yeesarapat et al., we compared the classification performance on the 7 training images and 15 blind test images used by Yeesarapat et al., as reported in Table 9. It should be noted that their training images were the same set used in our experiment. Their blind test images were also a subset of the 128 blind test images discussed in Sections 4.2 and 4.3. The results of fluorosis classification are shown in Table 9, where our model achieved agreement with the expert's opinion for 5 out of 7 images (71.43%) and 10 out of 15 images (66.67%) for the training and blind test sets, respectively. However, the model of Yeesarapat et al. only achieved agreement for 3 out of 7 images (42.86%) and 8 out of 15 images (53.33%) for the training and blind test sets, respectively. These results could be interpreted as 28.57% and 13.33% improvements for the training and blind test sets, respectively. Consequently, these results demonstrate that our approach to fluorosis classification is superior to that of Yeesarapat et al.

Conclusions
In this paper, we proposed an automatic system of dental fluorosis classification using six color features from the RGB and HSI color spaces as feature vectors. Unsupervised possibilistic fuzzy clustering (UPFC) was used to cluster these features, since each feature vector had a label in one of five classes-white, yellow, opaque, brown, or background-based on the color of that pixel and whether the pixel was in the tooth area. We applied the cuckoo search algorithm to optimize the numbers of clusters in each class. A set of pixels from seven training images, trained with these algorithms, resulted in a set of optimal prototypes, where tooth segmentation was then performed using the fuzzy k-nearest neighbor (FKNN) algorithm, together with some morphological operations. We classified the stages of fluorosis into Normal, Stage 1, Stage 2, and Stage 3, based on the proportions of opaque and brown pixels in the tooth area. Our experimental results showed that the proposed method was superior to prior works, as it correctly classified four fluorosis classes for 86 out of 128 images in the blind test set. In addition, the average pixel accuracy of the segmented binary tooth masks was 92.24%, and the average pixel accuracy of teeth segmented into white-yellow, opaque, and brown pixels was 79.46%.
In future works, it will be possible to improve the performance of the proposed method by replacing the fluorosis classification rule with a learning algorithm. Increasing the size of the training set-especially with more variations in fluorosis conditions and light conditions-would also be another way to enhance the performance. Moreover, to improve the efficiency of the system, the results need to be confirmed with clinical tests. Otherwise, the fluorosis stage detections from the system might be mistaken for other enamel hypoplasias or pigmentations due to drugs or smoking, etc. In addition, to increase the performance of the application from the point of view of a homogeneous application, we could develop a system that requires a captured image to cover the vestibular surface of at least four upper and lower incisors. As our proposed model is lightweight, one practical direction would be to deploy our model to mobile devices and evaluate its real-world performance in both clinical and non-clinical settings.